DOCUMENT RESUME

ED 210 305                                        TM 810 954

AUTHOR          Kingsbury, G. Gage; Weiss, David J.
TITLE           A Validity Comparison of Adaptive and Conventional
                Strategies for Mastery Testing.
INSTITUTION     Minnesota Univ., Minneapolis. Dept. of Psychology.
SPONS AGENCY    Office of Naval Research, Arlington, Va. Personnel
                and Training Research Programs Office.
REPORT NO       ONR-RR-81-3
PUB DATE        Sep 81
CONTRACT        N00014-79-C-0172
NOTE            36p.

EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     *Achievement Tests; Biology; *Comparative Analysis;
                Computer Assisted Testing; Criterion Referenced
                Tests; Discriminant Analysis; Higher Education;
                *Latent Trait Theory; *Mastery Tests; Scoring; *Test
                Validity
IDENTIFIERS     *Adaptive Testing; Tailored Testing; Test Length

ABSTRACT

Conventional mastery tests designed to make optimal
mastery classifications were compared with fixed-length and
variable-length adaptive mastery tests. Comparisons between the
testing procedures were made across five content areas in an
introductory biology course from tests administered to volunteers.
The criterion was the student's standing in the course, based on
examinations and laboratory grades. Results showed that adaptive tests
resulted in mastery classifications more consistent with final class
standing than those obtained from conventional tests. This result was
observed within individual content areas and for discriminant
analysis classifications made across content areas. This result was
also observed for two scoring procedures used with the conventional
tests. Results indicated that there was no decrement in the
performance of the adaptive test when a variable termination rule was
implemented. Further analyses showed that the adaptive tests
administered differed from the conventional test for each content
area as a function of achievement level. This evidence was used to
explain why the adaptive tests resulted in more valid decisions than
the conventional procedure. Variable-length adaptive mastery tests
can provide more valid mastery classifications than "optimal"
conventional mastery tests while reducing test length an average of
80% from the length of conventional tests. (Author)






A VALIDITY COMPARISON OF 
ADAPTIVE AND CONVENTIONAL 
STRATEGIES FOR 
MASTERY TESTING 



G. Gage Kingsbury 
and 

David J. Weiss 






Research Report 81-3 
September 1981 

Computerized Adaptive Testing Laboratory 
Psychometric Methods Program 
Department of Psychology 
University of Minnesota
Minneapolis, MN 55455

This research was supported by funds from the Army
Research Institute, Air Force Office of Scientific
Research, Air Force Human Resources Laboratory, and
the Office of Naval Research, and monitored by the
Office of Naval Research.



Approved for public release; distribution unlimited.
Reproduction in whole or in part is permitted for 
any purpose of the United States Government. 






Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE          READ INSTRUCTIONS BEFORE COMPLETING FORM

1. REPORT NUMBER
   Research Report 81-3

2. GOVT ACCESSION NO.

3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)
   A Validity Comparison of Adaptive and
   Conventional Strategies for Mastery Testing

5. TYPE OF REPORT & PERIOD COVERED
   Technical Report

6. PERFORMING ORG. REPORT NUMBER

7. AUTHOR(s)
   G. Gage Kingsbury and David J. Weiss

8. CONTRACT OR GRANT NUMBER(s)
   N00014-79-C-0172

9. PERFORMING ORGANIZATION NAME AND ADDRESS
   Department of Psychology
   University of Minnesota
   Minneapolis, Minnesota 55455

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS
    P.E.: 61153N    Proj.: RR042-04
    T.A.: RR042-04-01
    W.U.: NR 150-433

11. CONTROLLING OFFICE NAME AND ADDRESS
    Personnel and Training Research Programs
    Office of Naval Research
    Arlington, Virginia 22217

12. REPORT DATE
    September, 1981

13. NUMBER OF PAGES
    25

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)

15. SECURITY CLASS. (of this report)

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)
    Approved for public release; distribution unlimited. Reproduction in whole
    or in part is permitted for any purpose of the United States Government.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

18. SUPPLEMENTARY NOTES
    This research was supported by funds from the Army Research Institute, Air
    Force Office of Scientific Research, Air Force Human Resources Laboratory,
    and the Office of Naval Research, and monitored by the Office of Naval Research.

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)
    mastery testing
    achievement testing
    adaptive testing
    computerized testing
    criterion-referenced testing
    tailored testing
    item response theory
    latent trait theory
    item characteristic curve theory

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)



Conventional mastery tests designed to make optimal mastery
classifications were compared with fixed-length and variable-length adaptive
mastery tests in terms of validity of decisions with respect to an external
criterion measure. Comparisons between the testing procedures were made across
five content areas in an introductory biology course from tests administered to
over 400 volunteer students. The criterion measure used was the student's
final standing in the course, based on course examinations and laboratory
grades. Results indicated that the adaptive test resulted in mastery
classifications that were more consistent with final class standing than those
obtained from the conventional test. This result was observed within
individual content areas and for discriminant analysis classifications made
across content areas. This result was also observed for two scoring procedures
used with the conventional test (proportion-correct and Bayesian scoring).
Results also indicated that there was no decrement in the performance of the
adaptive test when a variable termination rule was implemented. This variable
termination rule resulted in test lengths which were, on the average, 74% to
88% shorter than the original adaptive tests. Further analyses explicated the
manner in which the adaptive tests administered differed from the conventional
test for each content area as a function of achievement level. This evidence
was used to explain why the adaptive tests resulted in more valid decisions
than the conventional procedure, in spite of the fact that the type of
conventional test used here was the most informative test concerning the
mastery cutoff. It is concluded that variable-length adaptive mastery tests
can provide more valid mastery classifications than "optimal" conventional
mastery tests while reducing test length an average of 80% from the length of
the conventional tests.






Contents

Introduction
Method
    Subjects
    Test Administration
    Classroom Mastery
    Test Construction
        Mastery Level
        Adaptive Tests
        Conventional Tests
    Scoring
    Analyses
        Comparison of the Tests Given
        Comparison of Test Validities
Results
    Comparison of the Tests Given
        Test Overlap
        Effect of Variable Termination
        Information
    Comparison of Test Validities
        Subtest Validities
        Discriminant Function Analysis within Testing Sessions
        Discriminant Functions across Testing Sessions
Discussion and Conclusions
References
Appendix: Supplementary Tables







A Validity Comparison of Adaptive and 
Conventional Strategies for Mastery Testing 






The adaptive mastery testing (AMT) procedure developed by Kingsbury and
Weiss (1979) is designed to make high-precision classifications concerning
students' mastery of specific content areas within a course of instruction. The
procedure is also intended to minimize the number of test questions needed to
make these classifications in order to increase the amount of class time
available for actual instruction. The AMT procedure makes use of item response
theory (IRT; Lord, 1980; Lord & Novick, 1968) to adapt the test items administered
to suit each student. The AMT procedure was compared in Monte Carlo simulation
(Kingsbury & Weiss, 1980) to a sequential decision procedure developed by Wald
(1947) and to a conventional mastery decision procedure. This simulation
indicated that the AMT procedure resulted in the most valid mastery classifications
of the three methods across most conditions examined.

The present study was designed to further investigate the properties of the 
AMT procedure and to compare it with a conventional mastery test with optimal 
information characteristics. This comparison is of interest for practical, as 
well as theoretical, reasons. If it were found that a conventional test with 
certain design characteristics could make mastery classifications as well as or 
better than the AMT procedure, it would probably be more economical to employ 
the conventional paper-and-pencil testing procedure in most classroom situations 
(although the rapid proliferation of inexpensive computers is quickly reducing 
the economic advantage of paper-and-pencil testing). 

This study was designed to address three basic questions concerning the 
performance of these testing procedures within the context of a live-testing 
situation, using currently available items for which IRT parameter values had 
previously been estimated. The first question addressed was whether or not the 
testing procedure chosen made a difference in terms of the set of test items 
given to the students. Obviously, if the AMT procedure were to select the same 
items as the conventional test for most of the students, the AMT item selection 
procedure would be an unnecessary addition to the testing situation in the 
classroom. To address this question, the overlap in tests generated by the two 
procedures was examined as a function of achievement level. In addition, the
theoretical information available from the questions administered by the two
testing strategies was examined as a function of achievement level. 

The second question addressed in this study concerned the criterion-related 
validity of the mastery classifications made by the two testing procedures. To 
the extent that one testing procedure results in mastery classifications more 
adequately reflecting some real criterion of performance, that procedure could 
be designated as a more valid testing paradigm. 

The final question concerned the effect of the variable termination
criterion for the AMT procedure. This termination criterion is based on the use of
Bayesian confidence intervals with certain characteristics and should result in
shorter overall test lengths. It is of some practical interest to determine how
much test length would be reduced by the use of the AMT termination procedure in
a live-testing situation. In addition, it might be expected that the variable
termination criterion would affect the validity of decisions made by the AMT
procedure. The strength of this expected effect was also examined.

Method 

Subjects 

Data were obtained from students enrolled in an introductory biology course 
at the University of Minnesota during fall quarter 1979. Volunteers were re- 
cruited to take experimental computerized tests, covering the same material as 
would be covered in course examinations, prior to their classroom midquarter and 
final exams. Administration of the computerized tests began three weeks prior 
to the actual classroom exams. Students received one point, which was added to 
their final course grade, for taking one computerized test and an additional two 
points for participating in both the midquarter and final computerized testing 
sessions. Students were assigned sequentially to either an adaptive or a con- 
ventional testing condition. From the testing session prior to the midquarter, 
conventional test data were obtained from 237 students and adaptive test data 
from 237 students. From the testing sessions prior to the final exam, conven- 
tional test data were obtained from 226 students and adaptive test data were 
obtained from 226 students. 

In addition to the computerized test data collected from these students, 
classroom exam and laboratory scores were also available for most of these stu- 
dents. These classroom scores were used in the analysis of the criterion-re- 
lated validity of the various testing procedures. For this analysis of criteri- 
on-related validity, both classroom data and computerized testing data were 
available for 214 students in the conventional testing condition during the 
first testing session (prior to the midquarter exam), 213 students in the
adaptive testing condition during the first testing session, 209 students in the
conventional testing condition during the second testing session (prior to the 
final exam), and 219 students in the adaptive testing condition during the sec- 
ond testing session. 

Test Administration 

After assignment to either the conventional or the adaptive testing condi- 
tion, the student was administered two or three subtests, which were adminis- 
tered by a cathode-ray terminal linked to a minicomputer system. During the 
first testing session, students took three 20-item subtests designed to evaluate 
their knowledge of the Chemistry, Cell Structure, and Energy content areas, 
which were taught in the biology class prior to the midquarter exam. During the
second testing session, students were administered two 20-item subtests that
were designed to evaluate their knowledge of the Genetics and
Reproduction/Embryology content areas, which were taught in the biology class following the
midquarter exam and prior to the final exam. 

Each of the questions administered to the students during these experimental
testing sessions was in four-alternative multiple-choice format. The pools
of test questions had been gathered from the questions that had been used in 
classroom examinations previously and therefore were representative of the con- 
tent being taught in the classroom. 






Item Pools 

The five item pools developed to measure student achievement in the five 
content areas of interest were composed of examination questions that were ad- 
ministered in the general Biology course during the 1975-1976 and 1976-1977 aca- 
demic years. The items were parameterized within their respective content areas 
using the procedure described by Urry (1976). This procedure estimates the
discrimination power (a), difficulty (b), and guessing level (c) parameters
required for the use of the three-parameter logistic IRT model (Lord, 1980; Lord &
Novick, 1968). This calibration procedure is described in detail by Bejar,
Weiss, and Kingsbury (1977). The sample sizes used for parameter estimation
varied from approximately 800 to 1,200 students. Final item pool sizes ranged
from 51 items, for the Reproduction/Embryology content area, to 87 items, for
the Energy content area. Item identification numbers and IRT item parameter
estimates for the items used in each content area item pool are shown in
Appendix Tables A through E.
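
To illustrate the role of the three item parameters, the following minimal Python sketch evaluates the three-parameter logistic response function. It is not taken from the report: the function name `p_correct`, the item values, and the scaling constant D = 1.7 are assumptions made for the example.

```python
import math

D = 1.7  # assumed scaling constant; the report does not state the value used


def p_correct(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: achievement level; a: discrimination; b: difficulty; c: guessing level.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical four-alternative item: at theta == b the probability lies
# halfway between the guessing level c and 1.
print(p_correct(0.0, a=1.2, b=0.0, c=0.25))
```

When the achievement level equals the item difficulty, the probability is c + (1 - c)/2, so the guessing parameter raises the lower asymptote of the curve without moving its point of steepest slope.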

Classroom Mastery 



The validity criterion for evaluation of the mastery classifications made 
by the two testing strategies was a student's course grade, as determined by the 
sum of a student's midquarter classroom exam score, final classroom exam score,
and a laboratory grade. The maximum score obtainable was 100 points on each for 
a possible total of 300 points. 

For each student the total score was evaluated to determine his/her mastery 
status on the classroom criterion. A student was declared a master on the 
classroom mastery criterion if he/she received at least 240 out of the possible
300 points. This criterion corresponds to the 80% cutoff between grades of C 
and B for classroom performance. By the comparison of the students against this 
classroom mastery level, an independent evaluation of students' mastery status 
was obtained that was used to examine the criterion-related validity of each of 
the experimental testing strategies. 

Test Construction 

Mastery level. In order to examine how well each testing strategy made
mastery classifications, it was necessary to establish a reasonable level of
performance that would be comparable to the classroom mastery level. It would
then be necessary to construct the various experimental subtests so that they
would be maximally efficient for making classifications at the specified mastery
level.

For a conventional test using proportion-correct scoring, this 80% correct
mastery level (as used in the classroom) would be sufficient for use in making
mastery classifications. When an IRT scoring procedure is to be used, the
mastery level must be converted from the proportion-correct metric to the latent
achievement metric for each content area. Consequently, for each of the five
content areas, the 80% criterion was converted to the achievement (θ) metric by
use of the test characteristic curve (TCC) for the content area item pools, as
described by Kingsbury and Weiss (1979). The θ value on the achievement metric
that would most likely correspond to the 80% correct mastery level for each
content area is shown in Table 1, along with the subject matter designation of each
content area.
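
The conversion from the proportion-correct metric to the achievement metric can be sketched as a root-finding problem on the test characteristic curve. The Python example below is only an illustration under stated assumptions: the item pool is hypothetical, the TCC is taken to be the mean of three-parameter logistic item response functions with D = 1.7, and bisection is used as the solver. The report does not describe the numerical method actually used.

```python
import math

D = 1.7  # assumed scaling constant


def p_correct(theta, a, b, c):
    """Three-parameter logistic item response function."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def tcc(theta, items):
    """Expected proportion correct over the pool at achievement level theta."""
    return sum(p_correct(theta, a, b, c) for a, b, c in items) / len(items)


def theta_for_proportion(target, items, lo=-4.0, hi=4.0, tol=1e-6):
    """Find theta where the TCC equals `target` by bisection.

    Relies on the TCC increasing in theta (all a > 0) and on the target
    lying between tcc(lo) and tcc(hi).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tcc(mid, items) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical pool of (a, b, c) triples; not the report's item parameters.
pool = [(1.0, -0.5, 0.20), (1.3, 0.0, 0.25), (0.8, 0.4, 0.20), (1.5, -1.0, 0.15)]
theta_m = theta_for_proportion(0.80, pool)
print(theta_m, tcc(theta_m, pool))
```

Because guessing keeps the expected proportion correct above the average c value, targets at or below that floor have no solution on the θ metric.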



Table 1

Subject Matter Included in Each Content Area,
and the Achievement Level Used
as the Mastery Level for Each Content Area

Content                                 Mastery Level
Area       Subject Matter               on the θ Metric
  1        Chemistry                        .27
  2        Cell Biology                     .23
  3        Energy                           .79
  4        Genetics                         .73
  5        Reproduction/Embryology          .65


Adaptive tests. The adaptive subtests administered to the students
assigned to the adaptive testing condition followed the AMT paradigm described by
Kingsbury and Weiss (1979) with one exception. As in the earlier study, a
student's achievement level was estimated following his/her response to each test
question using Owen's Bayesian scoring algorithm (Owen, 1969). The student's
achievement level estimate was then used to select the next item to be
administered. Each item remaining in the content area item pool was evaluated in terms
of its theoretical information (Birnbaum, 1968), and the item that was capable
of providing the most information at the student's current achievement level
estimate was chosen to be administered next. In the original AMT paradigm,
items were administered to a student until a decision concerning the student's
mastery level could be made with a certain degree of confidence, and then the
test was terminated. In this study a fixed subtest length of 20 items was used
for each content area subtest. Analyses were designed, in part, to test the
desirability of the use of the variable-termination rule versus fixed
termination in this live-testing application of AMT. This procedure also permitted
comparison of adaptive and conventional tests of the same test length.

Each student began each of the content area subtests with a Bayesian prior 
distribution for his/her achievement level, which had a variance of 1.0 and a 
mean that was equal to the mastery level for the content area in question. This 
was equivalent to making the assumption that it was equally probable that a stu- 
dent was a master or a nonmaster. 
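
The maximum-information item selection step described above can be sketched as follows. This is an illustrative Python example, not the authors' program: the pool is hypothetical, D = 1.7 is assumed, and the information formula is the standard three-parameter logistic form. Owen's Bayesian updating of the achievement estimate is not shown; the example only selects the next item for a given estimate.

```python
import math

D = 1.7  # assumed scaling constant


def p_correct(theta, a, b, c):
    """Three-parameter logistic item response function."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the three-parameter logistic model."""
    p = p_correct(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def next_item(theta_hat, pool, administered):
    """Return the id of the unused item with the most information at theta_hat."""
    candidates = [item_id for item_id in pool if item_id not in administered]
    return max(candidates, key=lambda i: item_information(theta_hat, *pool[i]))


# Hypothetical pool: item id -> (a, b, c)
pool = {1: (1.0, -2.0, 0.2), 2: (1.0, 0.0, 0.2), 3: (1.0, 2.0, 0.2)}

first = next_item(0.1, pool, administered=set())
second = next_item(0.1, pool, administered={first})
print(first, second)
```

In the study each subtest started from a prior mean equal to the mastery level, so the first item chosen this way would be the pool's most informative item at the cutoff.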

Conventional tests. For a one-point classification problem like the one
involved here, the optimal conventional test is made up of that set of k items
that provides the most information in the vicinity of the achievement level
chosen as the cutting score θ_m, where item information is defined as in Birnbaum
(1968, Equation 20.4.16) and evaluated at θ = θ_m (Lord, 1980). To operationalize
this design, each item for a particular content area was evaluated in terms of
its theoretical information at the mastery level for the content area. The 20
most informative items at the mastery level were chosen to serve as the
conventional test questions for that content area. The order of administration of the
items to students within each content area was arbitrary, although each student
in the conventional testing condition received the questions in the same order.
The parameter estimates for the items that made up the conventional tests for
each content area are designated in Appendix Tables A through E.



Scoring 



Two scores were obtained for each adaptive subtest: the achievement level
estimate (θ̂) following administration of the 20th item and the achievement level
estimate at the item at which a 95% Bayesian confidence interval surrounding
that estimate did not include the mastery cutoff on the achievement continuum.
(For a more detailed description, see Kingsbury & Weiss, 1979, pp. 6-8.) For the
conventional subtests two scores were computed: the proportion of the subtest
items answered correctly and the Bayesian estimate of achievement level obtained
using program Lindsco (Bejar & Weiss, 1979) for each subtest.

For both the adaptive and conventional tests, a mastery classification for
each student was made for each subtest. If a student's achievement level
estimate was greater than or equal to the appropriate mastery level, he/she was
declared a master; if the student's achievement level estimate was less than the
mastery level, he/she was declared a nonmaster.
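
The variable-termination score and the classification rule can be illustrated with a short Python sketch. It assumes that the 95% Bayesian confidence interval is formed from a normal approximation (estimate plus or minus 1.96 posterior standard deviations); the function names and the sequence of estimates are hypothetical and are not data from the study.

```python
import math

Z_95 = 1.96  # assumed normal-approximation multiplier for a 95% interval


def interval_excludes_cutoff(theta_hat, posterior_var, cutoff, z=Z_95):
    """True when the interval around theta_hat does not contain the cutoff."""
    half_width = z * math.sqrt(posterior_var)
    return cutoff < theta_hat - half_width or cutoff > theta_hat + half_width


def classify(theta_hat, cutoff):
    """Master if the estimate is at or above the mastery level."""
    return "master" if theta_hat >= cutoff else "nonmaster"


def variable_length_result(estimates, cutoff):
    """Return (items used, classification) under the variable-termination rule.

    `estimates` holds (theta_hat, posterior_var) after each administered item.
    If the interval never excludes the cutoff, the last estimate is used.
    """
    for n_items, (theta_hat, var) in enumerate(estimates, start=1):
        if interval_excludes_cutoff(theta_hat, var, cutoff):
            return n_items, classify(theta_hat, cutoff)
    theta_hat, _ = estimates[-1]
    return len(estimates), classify(theta_hat, cutoff)


# Hypothetical trace for a cutoff of .27 (the Content Area 1 mastery level)
trace = [(0.60, 0.70), (0.95, 0.45), (1.30, 0.25), (1.40, 0.20)]
result = variable_length_result(trace, cutoff=0.27)
print(result)
```

The fixed-length score corresponds to calling `classify` on the estimate after the 20th item, regardless of whether the interval has already excluded the cutoff.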



Analyses

Comparison of the Tests Given 

To determine whether the two testing strategies resulted in the administra- 
tion of significantly different tests, the percentage of items administered 
within the 20-item AMT that also appeared in the conventional test was calculat- 
ed for each person who took the adaptive tests. This was done separately for 
each of the five content area subtests. The percentage of overlap between the 
two types of tests was then plotted as a function of the estimated achievement 
level. These plots were smoothed by dividing the achievement level continuum 
into 20 approximately equal intervals and by plotting the mean percentage of 
overlap observed for all individuals whose achievement level estimate fell into 
each interval. 
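
The overlap computation and the interval-based smoothing can be sketched in Python as follows. The function names, the range of the achievement continuum (-2.0 to 2.0), and the use of exactly equal-width intervals are assumptions for illustration; the report says only that 20 approximately equal intervals were used.

```python
from collections import defaultdict


def overlap_percentage(adaptive_items, conventional_items):
    """Percentage of the adaptive test's items that also appear on the conventional test."""
    shared = set(adaptive_items) & set(conventional_items)
    return 100.0 * len(shared) / len(adaptive_items)


def binned_means(thetas, values, lo=-2.0, hi=2.0, n_bins=20):
    """Mean of `values` within each equal-width theta interval.

    Returns {interval midpoint: mean value} for intervals that contain data.
    """
    width = (hi - lo) / n_bins
    sums = defaultdict(float)
    counts = defaultdict(int)
    for theta, value in zip(thetas, values):
        if lo <= theta <= hi:
            k = min(int((theta - lo) / width), n_bins - 1)
            sums[k] += value
            counts[k] += 1
    return {lo + (k + 0.5) * width: sums[k] / counts[k] for k in sorted(counts)}


# Hypothetical item ids: a 20-item adaptive test sharing 16 items with the conventional test
adaptive = list(range(20))
conventional = list(range(4, 24))
pct = overlap_percentage(adaptive, conventional)

# Hypothetical (estimate, overlap) pairs for three examinees
smoothed = binned_means([-1.95, -1.9, 0.3], [5.0, 15.0, 80.0])
print(pct, smoothed)
```

Averaging within intervals trades resolution along the achievement continuum for stability, which matters at the extremes where few students fall into each interval.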

To determine the effect of the variable termination criterion on the
performance of the AMT procedure, frequency distributions were compiled within each
content area, showing the number of students for whom the AMT procedure would
have reached its termination point as a function of the number of items
administered. The percentage of students for whom the AMT procedure reached a
confident mastery classification at or before the completion of the 20-item adaptive
test within each content area was also determined.

To further compare the tests given by the conventional and adaptive testing
strategies, information functions were calculated for each of the testing
strategies within each content area. For each of the conventional tests, the
function calculated was simply the theoretical test information function (Birnbaum,
1968) within each subtest, which is the sum of the item information functions
for the 20-item tests. For the adaptive tests, the information functions were
approximated by calculating for each person the sum of the item information
functions for the items administered, evaluated at the final achievement level
estimate. These information values were then plotted using the smoothing
procedure described above. Adaptive test information functions were calculated for
the fixed 20-item test length and for the variable-termination condition.






Comparison of Test Validities 

As a preliminary test of the validity of each of the four classification
procedures, namely mastery status estimated (1) from the conventional test using the
proportion-correct score, (2) from the conventional test using the Bayesian
score, (3) from the AMT procedure with the variable-termination criterion, and
(4) from the AMT procedure with the fixed, 20-item test length, Pearson
product-moment (phi) correlations were calculated between the mastery status
estimated by the classification procedure and the mastery status observed on the
classroom performance criterion measure (0 = nonmaster, 1 = master). This was
done for each classification procedure, for each content area. In addition, the
frequencies of false mastery classifications and false nonmastery
classifications were calculated for each classification procedure within each content
area.
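
As an illustration of these validity indices, the Python sketch below computes the phi coefficient from a 2 x 2 cross-tabulation of test classification by criterion status, along with the two error counts. The function name and the six-student data set are hypothetical. A false mastery is taken here to mean a student classified as a master by the test but a nonmaster on the classroom criterion, and a false nonmastery the reverse.

```python
import math


def classification_summary(predicted, observed):
    """Phi coefficient and error counts for 0/1 predicted vs. observed mastery."""
    pairs = list(zip(predicted, observed))
    n11 = sum(1 for p, o in pairs if p == 1 and o == 1)  # correct mastery
    n10 = sum(1 for p, o in pairs if p == 1 and o == 0)  # false mastery
    n01 = sum(1 for p, o in pairs if p == 0 and o == 1)  # false nonmastery
    n00 = sum(1 for p, o in pairs if p == 0 and o == 0)  # correct nonmastery
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    phi = (n11 * n00 - n10 * n01) / denom
    return {"phi": phi, "false_mastery": n10, "false_nonmastery": n01}


# Hypothetical classifications for six students (1 = master, 0 = nonmaster)
test_status = [1, 1, 0, 0, 1, 0]
class_status = [1, 0, 0, 0, 1, 1]
summary = classification_summary(test_status, class_status)
print(summary)
```

Because both variables are dichotomous, the Pearson product-moment correlation between them reduces to this phi formula.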

To further examine the validities of the mastery estimation strategies,
discriminant function analysis (Tatsuoka, 1971) was used to combine the separate
content area mastery classifications to more accurately predict the global
classroom mastery status criterion. First, groups of 100 students were drawn
from each testing condition within each testing session. A discriminant
function was calculated for each of these development groups.

For the first testing session, a student's mastery status estimates from
each of the three content area subtests taken were used as predictors in a
discriminant function to estimate the student's classroom mastery status. For the
second testing session, the two content area subtest mastery levels were used to
estimate the student's overall classroom mastery status. A different prediction
equation was developed for each different classification procedure. These
functions were then applied to the remainder of the appropriate testing groups in
order to cross-validate the discriminant functions. Frequencies and types of
classification errors made by the discriminant functions for each of the testing
procedures within each testing session were determined for both the development
and validation groups.

As a final validity comparison, a discriminant function analysis was con- 
ducted on the subgroups of students who took the same type of test (adaptive or 
conventional) during both testing sessions. This analysis used the mastery 
classifications made in all five content areas to predict the classroom mastery 
level. A one-group discriminant analysis was used here because sample sizes 
were too small to allow for a development group and a cross-validation group.
Again, frequencies and types of classification errors made were examined for 
each testing procedure. 

Results 

Comparison of the Tests Given 

Test overlap. Figure 1 shows the percentage of items administered to
students taking adaptive tests that also appeared on the corresponding conventional
tests for the first and second testing sessions (Figures 1a and 1b,
respectively). The percentage of overlap is shown as a function of achievement level, as
estimated by the adaptive testing procedure after 20 items were administered.
For each content area subtest the achievement level used as the mastery cutoff
level is indicated.



Figure 1

Proportion of Items from the Conventional Test That Were Administered to Students
Taking the Adaptive Test as a Function of Achievement Level, for Each Content Area
(Mastery Levels Are Indicated as θ_m)

[Figure not reproduced. Panel (a): Content Areas 1, 2, and 3. Panel (b): Content
Areas 4 and 5. Horizontal axis: Estimated Achievement Level (θ̂). Vertical axis:
proportion of overlap, 0.0 to 1.0.]




As Figure 1 shows, for each content area the relationship between the
percentage of overlap and the achievement continuum is a unimodal function, peaked
at moderate achievement levels and much lower at more extreme achievement
levels. Across all content areas the highest proportion of overlap was observed
for Content Area 4, and was .88 at an achievement level of approximately θ̂ = .9
(Figure 1b). The lowest peak overlap observed for any content area was .80, for
Content Area 2 at an achievement level of approximately θ̂ = .3 (Figure 1a). For
these levels of maximum overlap, then, the 20-item adaptive subtests
administered an average of 16 to 18 items that appeared on the conventional subtests.

The lowest level of overlap observed was .05, for Content Area 3 at
achievement levels of approximately θ̂ = -1.9 to -1.5 and for Content Area 1 at
an achievement level of approximately θ̂ = -1.9. For these very low achievement
levels, the average overlap between the 20-item adaptive and conventional
subtests was about one item.

Figures 1a and 1b show that the maximum overlap within each content area
was observed at an achievement level that was quite close to the mastery level
for the content area. In Content Areas 1 through 3, the mastery level was
within the range on the achievement continuum that, upon application of the
smoothing procedure, was equivalent to the achievement level having the highest level
of overlap between the adaptive and conventional subtests. Content Area 4, for
which the mastery level and the peak of overlap were observably different, had a
mastery level of .73 and an observed overlap peak that occurred at an
achievement level of approximately .9. For Content Area 5 the mastery level was .65,
whereas the highest observed proportion of overlap occurred at an achievement
level of approximately .5. In each of these two content areas, the observed
difference between the mastery level and the approximate achievement level at
which the highest amount of overlap occurred between the conventional and
adaptive tests was less than .2 units on the achievement continuum (about 1/20th of
the effective score range for this group of students).

Thus, these data show that for those achievement level estimates in the 
immediate neighborhood of the mastery level for a particular content area, the 
adaptive procedure resulted in tests that, on the average, were quite similar to 
the conventional tests (differing by only a very few items). At the other ex- 
treme, for achievement level estimates quite discrepant from the mastery level, 
the adaptive testing procedure resulted in tests that, on the average, were very 
different from the conventional tests (having only a very few items in common). 
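
The overlap statistic discussed above can be illustrated with a short sketch. It assumes that overlap is the proportion of conventional-subtest items that also appear on a student's adaptive subtest; the function name and the item numbers below are hypothetical and are not taken from the study.

```python
def overlap_proportion(adaptive_items, conventional_items):
    """Proportion of conventional-test items that also appear on the adaptive test."""
    conventional = set(conventional_items)
    shared = conventional & set(adaptive_items)
    return len(shared) / len(conventional)


# Hypothetical item numbers, for illustration only.
conventional = range(1, 21)                # items 1..20
adaptive_near_cutoff = range(3, 23)        # shares items 3..20 (18 items)
adaptive_far_from_cutoff = range(20, 40)   # shares only item 20

print(overlap_proportion(adaptive_near_cutoff, conventional))      # 0.9
print(overlap_proportion(adaptive_far_from_cutoff, conventional))  # 0.05
```

Under this reading, an overlap of .05 on a 20-item subtest corresponds to one shared item, and an overlap of .9 corresponds to 18 shared items.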

Effect of variable termination. A "high-confidence" classification is made
when the Bayes confidence interval around an individual's estimated achievement
level fails to include the prespecified mastery cutoff. Table 2 shows the mean
test length needed to make a high-confidence classification and the percentage
of students for whom a high-confidence classification was made at or before the
end of the 20-item adaptive subtest within each content area. It can be seen
from these data that the mean number of items required to make a confident
classification ranged from 2.30 in Content Area 5 to 5.23 in Content Area 2.
These means imply a reduction in the length of the average test of
between 73.8% and 88.5% relative to the original 20-item test length.
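
A minimal sketch of such a termination rule follows. It assumes a symmetric interval built from a normal approximation to the posterior, with z = 1.96; the interval level, the function names, and the example values are assumptions for illustration and are not specified in this section. The second function rechecks the test-length reductions from the means in Table 2.

```python
import math


def classify(theta_hat, posterior_variance, cutoff, z=1.96):
    """Return 'master', 'nonmaster', or None if the interval still covers the cutoff.

    z = 1.96 (a 95% interval) is an assumed value, not one given in the text.
    """
    half_width = z * math.sqrt(posterior_variance)
    lower, upper = theta_hat - half_width, theta_hat + half_width
    if lower > cutoff:
        return "master"      # whole interval lies above the cutoff
    if upper < cutoff:
        return "nonmaster"   # whole interval lies below the cutoff
    return None              # cutoff inside the interval: administer another item


def length_reduction(mean_items, full_length=20):
    """Percentage reduction in test length relative to the full-length test."""
    return 100.0 * (1.0 - mean_items / full_length)


print(classify(1.5, 0.04, cutoff=0.5))   # master
print(classify(0.6, 0.25, cutoff=0.5))   # None
print(length_reduction(2.30))            # about 88.5
print(length_reduction(5.23))            # close to the 73.8% reported
```

Testing stops as soon as `classify` returns a decision, which is why students far from the cutoff need only a few items.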

In addition, Table 2 indicates that the percentage of students for whom the




Table 2

Summary Statistics for Number of Items Administered and Percentage
of Students for Whom a High-Confidence Classification Was Made
by the AMT Procedure Using a Variable Test Length

                              Number of Items               Percentage of
Content   Number of   --------------------------------     High-Confidence
Area      Students    Mean    Min.    Max.    Standard      Classifications
                                              Deviation
1         236         5.15    2       20      4.18          98.3
2         236         5.23    1       20      4.47          98.7
3         236         3.57    2       20      1.98          99.6
4         224         3.85    2       20      2.33          99.6
5         224         2.30    1       20      2.54          99.6

AMT procedure was able to make a confident classification in 20 items or fewer
ranged from 98.3% for Content Area 1 to 99.6% for Content Areas 3, 4, and 5.
These results indicate that fewer than 2% of the students needed a test of more
than 20 items for the adaptive procedure to make a confident classification in
any content area. Appendix Table F shows the percentage of students for whom
the AMT procedure reached its termination criterion for each test length within
each content area. In each content area the same general pattern of results was
observed. The great majority (more than 70%) of the students reached the termination
criterion with the administration of 1 to 5 items. The remaining students
were fairly evenly divided among the longer test lengths of 6 to 20
items.

Information. Figure 2 shows, for each of the five content areas, the information
functions that were observed for the conventional test, the adaptive
test with a fixed length (20 items), and the adaptive test with a variable-length
termination condition. Numerical values from which these figures were
obtained are shown in Appendix Tables G, H, and I. Mean information for the
adaptive tests was plotted as a function of the final achievement level estimate
obtained using that strategy. The values on the abscissa represent achievement
level estimates grouped in intervals with a range of ±.1 around the plotted
achievement level. For the conventional tests, theoretical test information
functions are plotted. (Dotted lines in these figures indicate regions of the θ
continuum for which no data values were available for that strategy.)

In each content area the adaptive test with 20 items resulted in more 
achievement level estimates with higher levels of information than either of the 
other two strategies, except near the cutoff level between mastery and nonmas- 
tery, at which the conventional test provided slightly more information. For 
each subtest the conventional test provided maximum information very close to
that subtest's mastery cutoff score. This was as expected, since the conventional
tests were developed by selecting those 20 items that provide the most
information at the mastery cutoff, thereby concentrating the test's efficiency 
near one point. Except for being slightly less efficient than the conventional 
test at the mastery cutoff, the adaptive strategy with a 20-item termination 
provided more precise estimates than the conventional strategy, particularly at 
the lower end of the achievement continuum. 






Figure 2

Test Information for Conventional Test
and Fixed- and Variable-Length Adaptive Mastery Tests
as a Function of Estimated Achievement Level

[Figure not reproduced. Panels (a) through (e) correspond to Content Areas 1
through 5; the plotted curves are Conventional Test, AMT Variable Termination,
and AMT Fixed Termination.]






For Subtests 1 and 2 the conventional test provided higher mean information 
values than the variable-termination adaptive strategy at all points along the 
achievement continuum. For Subtests 3, 4, and 5, the conventional test and var- 
iable-termination adaptive testing strategy fluctuated as to which provided more 
information. Generally, the variable-termination adaptive strategy provided 
more information than the conventional test at the lower portion of the achieve- 
ment continuum, while the conventional test provided more information at higher 
achievement levels. It was shown above, though, that the variable-length adap- 
tive testing procedure resulted in tests that were much shorter (2 to 5 items, 
on the average) than the conventional test (20 items). The higher information 
levels obtained from the conventional test are, at least partly, a function of 
the difference in test lengths. The variable-termination adaptive testing 
strategy provided less information at each achievement level than its 20-item 
counterpart because it usually consisted of far fewer items. 

It should be noted that the information curves for the adaptive subtests 
were computed by determining the mean information for students whose achievement 
level estimates fell within certain ranges of the achievement continuum. The 
conventional test information functions are theoretical and are evaluated at 
each point along the achievement continuum. Thus, some differences noted between
the adaptive tests and the conventional tests may be a function of both
the curve-smoothing procedure used with the adaptive tests, and the differences 
between use of estimated versus "true" achievement levels. 

Comparison of Test Validities 

Subtest validities. Table 3 shows the phi correlations between each individual's
mastery status (master = 1; nonmaster = 0) as estimated from the experimental
subtests given in each content area and as observed in classroom performance.
These correlations were calculated for each of the four testing strategies.
All of the coefficients observed were significantly different from zero
(p < .05) except for the correlations for the Content Area 5 conventional test
scored using the Bayesian scoring system (p = .066) and the Content Area 5
variable-length adaptive test (also p = .066). Coefficients ranged from .102 for
the variable-length adaptive test for Content Area 5 to .393 for the fixed-length
adaptive test for Content Area 1, indicating low to moderate validity for
each of the content area subtests as predictors of the global classroom mastery
criterion. For Content Areas 2 and 4 the AMT procedure with variable termination
produced the highest correlations of the four testing methods between estimated
and criterion mastery status (r = .324 and .391, respectively), although
the variable-termination procedure administered only about one-quarter as many
items as the other procedures. For Content Areas 1 and 3 the AMT procedure with
fixed test length resulted in the highest correlations (r = .393 and .388,
respectively). For Content Area 5 the conventional test with proportion-correct
scoring resulted in the highest correlation (r = .167). None of the correlation
coefficients within any one content area differed significantly from one another.
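
The phi coefficient for a 2 x 2 classification table can be computed as in the sketch below. The counts are an assumption: they are back-calculated from the Table 4 percentages for the fixed-length AMT in Content Area 1 (15.5%, 1.9%, 31.0%, and 51.6% of N = 213), rounded to whole students. The function name is hypothetical.

```python
import math


def phi_coefficient(correct_mastery, incorrect_mastery,
                    incorrect_nonmastery, correct_nonmastery):
    """Phi correlation between estimated and criterion mastery status."""
    a, b = correct_mastery, incorrect_mastery        # estimated masters
    c, d = incorrect_nonmastery, correct_nonmastery  # estimated nonmasters
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Assumed counts, back-calculated from percentages (N = 213).
phi = phi_coefficient(correct_mastery=33, incorrect_mastery=4,
                      incorrect_nonmastery=66, correct_nonmastery=110)
print(round(phi, 3))  # 0.393
```

With these assumed counts the result agrees with the .393 reported for the fixed-length adaptive test in Content Area 1.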

Table 3

Phi Correlation (r) Between the Criterion Mastery Status
and Estimated Mastery Status, and Number of Subjects (N)
Within Each Content Area for Each Testing Strategy

                              Testing Strategy and Score
                   Conventional                              AMT
          Proportion Correct     Bayesian         Variable Termination   Fixed Length
Content
Area      N     r      p*       N     r      p*     N     r      p*       N     r      p*
1         214   .313   <.001    214   .332   <.001  213   .371   <.001    213   .393   <.001
2         214   .313   <.001    214   .211   .001   213   .324   <.001    213   .281   <.001
3         214   .218   .001     214   .118   .043   213   .208   .001     213   .238   <.001
4         209   .226   <.001    209   .239   <.001  219   .391   <.001    219   .388   <.001
5         209   .167   .008     209   .105   .066   219   .102   .066     219   .156   .010

*Probability of rejecting null hypothesis of zero correlation.

Table 4 shows the percentage of total correct and incorrect mastery classifications
and the percentage of correct and incorrect mastery and nonmastery
classifications made by each testing strategy within each content area. Table 4
shows that for all content areas the vast majority of classifications made by
each testing strategy were nonmastery classifications. Performance of the students
on the classroom mastery criterion resulted in 49.0% of the students in
the total testing sample attaining mastery status, leaving 51.0% of the students
with nonmastery status. For the experimental subtests, though, the percentage
of students estimated to have achieved mastery status (averaged across the five
content areas, weighted by sample size) was 9.0% for the proportion-correct
scoring of the conventional subtests, 6.1% for the Bayesian scoring of the conventional
subtests, 11.8% for the adaptive subtests with variable test length,
and 9.5% for the adaptive subtests with fixed test length. These low percentages,
however, may be an artifact of the methodology used for this study.
As noted above, students were given their experimental tests in the weeks immediately
before their classroom exams. It is quite reasonable to assume that the
students had not yet studied for their exam when these tests were given and
therefore were functioning at a lower achievement level than they finally demonstrated
in their classroom performance. Although this may affect the absolute
performance levels of the students, it should have no effect on the relative
performance of the testing strategies.

The lowest total error rate (total incorrect classifications) observed in
Table 4 is 32.9%, for the AMT procedure with a fixed test length for Content
Area 1. The highest total error rate observed was 51.4%, for the conventional
procedure with Bayesian scoring for Content Area 3. Across content areas, the
conventional test with proportion-correct scoring made 457 incorrect classifications
out of 1,060 total classifications (43.1% incorrect classifications). The
conventional testing procedure with Bayesian scoring resulted in 482 incorrect
classifications out of 1,060 total classifications (45.5% incorrect classifications).
The AMT procedure with variable termination resulted in 416 incorrect
classifications out of 1,077 total classifications (38.6% incorrect classifications).
Finally, the AMT procedure with fixed test length made 421 incorrect
classifications out of 1,077 total classifications (39.1% incorrect classifications).
Since the five content areas differed in terms of difficulty, content,
and contribution to the classroom mastery criterion, no single content area subtest
was expected to adequately predict the global classroom performance criterion.
Consequently, the error rates observed for the various testing strategies



Table 4

Percentage of Correct and Incorrect Mastery and Nonmastery
Classifications Made by Each Testing Strategy Within Each Content Area

                                   Testing Strategy and Score
                               Conventional                 AMT
Content Area and            Proportion               Variable      Fixed
Classification              Correct      Bayesian    Termination   Length

Content Area 1
  Correct Non-Mastery       45.3         45.8        50.7          51.6
  Incorrect Non-Mastery     41.1         41.1        30.5          31.0
  Correct Mastery           12.6         12.6        16.0          15.5
  Incorrect Mastery           .9           .5         2.8           1.9
  Total Correct             57.9         58.4        66.7          67.1
  Total Incorrect           42.1         41.6        33.3          32.9
Content Area 2
  Correct Non-Mastery       42.1         44.4        52.1          52.6
  Incorrect Non-Mastery     34.6         44.4        35.2          38.0
  Correct Mastery           19.2          9.3        11.3           8.5
  Incorrect Mastery          4.2          1.9         1.4            .9
  Total Correct             61.2         53.7        63.4          61.1
  Total Incorrect           38.8         46.3        36.6          38.9
Content Area 3
  Correct Non-Mastery       45.8         45.8        53.1          53.5
  Incorrect Non-Mastery     47.2         50.9        41.8          41.8
  Correct Mastery            6.5          2.8         4.7           4.7
  Incorrect Mastery           .5           .5          .5           0
  Total Correct             52.3         48.6        57.8          58.2
  Total Incorrect           47.7         51.4        42.3          41.8
Content Area 4
  Correct Non-Mastery       53.1         53.1        48.9          50.2
  Incorrect Non-Mastery     42.6         42.1        31.5          34.2
  Correct Mastery            4.3          4.8        17.4          14.6
  Incorrect Mastery          0            0           2.3            .9
  Total Correct             57.4         57.9        66.3          64.8
  Total Incorrect           42.6         42.1        33.8          35.1
Content Area 5
  Correct Non-Mastery       53.1         53.1        50.2          51.1
  Incorrect Non-Mastery     44.5         45.9        46.1          46.6
  Correct Mastery            2.4          1.0         2.7           2.3
  Incorrect Mastery          0            0            .9           0
  Total Correct             55.5         54.1        52.9          53.4
  Total Incorrect           44.5         45.9        47.0          46.6

within content areas were rather high, as expected. 
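
The pooled error rates quoted in the text can be rechecked from the reported counts of incorrect and total classifications. The sketch below is only an arithmetic check; the dictionary layout and function name are illustrative choices.

```python
# (incorrect classifications, total classifications) as reported in the text
counts = {
    "Conventional, proportion correct": (457, 1060),
    "Conventional, Bayesian": (482, 1060),
    "AMT, variable termination": (416, 1077),
    "AMT, fixed length": (421, 1077),
}


def error_rate(incorrect, total):
    """Percentage of classifications that were incorrect."""
    return 100.0 * incorrect / total


for strategy, (incorrect, total) in counts.items():
    print(f"{strategy}: {error_rate(incorrect, total):.1f}% incorrect")
```

The four printed rates are 43.1%, 45.5%, 38.6%, and 39.1%, matching the percentages given in the text.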

Discriminant function analysis within testing sessions. Table 5 shows the
percentages of incorrect classifications made using discriminant function analysis
within the development samples from each testing session and for each testing
strategy. For Testing Session 1, subtest classifications from Content Areas
1, 2, and 3 were used in the discriminant function to predict each student's

Table 5

Percentage of Incorrect Mastery Classifications Made by the Discriminant
Function of Content Area Mastery Classifications from Each Testing Strategy
During Each Testing Session, for the Development Group (N = 100)

                               Testing Session 1                    Testing Session 2
Testing Strategy         Incorrect     Incorrect  Total       Incorrect     Incorrect  Total
and Score                Non-Mastery   Mastery    Incorrect   Non-Mastery   Mastery    Incorrect

Conventional
  Proportion Correct     28            3          31          40            0          40
  Bayesian Score         35            1          36          41            0          41
AMT
  Variable Termination   24            2          26          29            4          33
  Fixed Length           27            1          28          30            2          32










Table 6

Percentage of Incorrect Mastery Classifications Made by the Discriminant Function
of Content Area Mastery Classifications from Each Testing Strategy, During Each Testing
Session, for the Cross-Validation Group, and the Number of Students (N) in Each Group

                                  Testing Session 1                          Testing Session 2
Testing Strategy               Incorrect     Incorrect  Total             Incorrect     Incorrect  Total
and Score                N     Non-Mastery   Mastery    Incorrect   N     Non-Mastery   Mastery    Incorrect

Conventional
  Proportion Correct     114   28.1          7.9        36.0        109   42.2          0.0        42.2
  Bayesian Score         114   36.0          4.4        40.4        109   43.1          0.0        43.1
AMT
  Variable Termination   113   29.2          5.3        34.5        119   33.6           .8        34.4
  Fixed Length           113   31.0          4.4        35.4        119   36.1          0.0        36.1



classroom mastery status. For Testing Session 2, subtest classifications from 
Content Areas 4 and 5 were used as predictors. The coefficients used for each 
discriminant function are shown in Appendix Table J. From Table 5, it may be 
seen that the total error percentages for Testing Session 1 ranged from 26% for
the AMT procedure with variable termination to 36% for the conventional test 
with Bayesian scoring. For Testing Session 2 the total error percentages ranged 
from 32% for the AMT procedure with fixed test length, to 41% for the conven- 
tional test with Bayesian scoring. For both testing sessions, the two AMT pro- 
cedures each resulted in lower error rates than either of the conventional test 
strategies. The two AMT procedures resulted in very similar total error per- 
centages across both testing sessions. 

Table 6 shows the error percentages that resulted when the discriminant
functions were applied to classify the remainder of the testing sample for each
testing strategy, for both testing sessions. The AMT procedure with variable
termination resulted in the lowest total error rates (34.5% in Session 1 and
34.4% in Session 2). The AMT procedure with fixed test length resulted in the
second lowest total error rates (35.4% in Session 1 and 36.1% in Session 2).
The conventional test with proportion-correct scoring gave the third lowest error
rates (36.0% in Session 1 and 42.2% in Session 2). The highest total error
rates noted for the cross-validation group were observed for the use of the conventional
test with Bayesian scoring (40.4% in Session 1 and 43.1% in Session
2). As in the development group, the two AMT procedures differed very little in
terms of total error rates for the cross-validation groups. The differences in
total error percentages for the two AMT procedures were .9 percentage points for
Session 1 and 1.7 percentage points for Session 2.

Discriminant functions across testing sessions . Table 7 shows the percent- 
age of incorrect decisions made by the discriminant functions developed for each 
testing strategy from the mastery classifications made in each of the five con- 
tent areas, for individuals who were administered the same type of test during 
both testing sessions. The discriminant function coefficients used to make 
these mastery classifications are also shown in Appendix Table J. Table 7 shows 
that the percentage of incorrect nonmastery classifications (false nonmastery) 
was much higher than the percentages of incorrect mastery classifications. This 
trend was earlier observed for the other discriminant function analyses. In 



Table 7

Percentage of Incorrect Mastery Classifications Made by the
Discriminant Function of Content Area Mastery Classifications
from Each Testing Strategy, for Students Who Took the
Same Type of Test during Both Testing Sessions (N = 89)

Testing Strategy         Incorrect     Incorrect   Total
and Score                Nonmastery    Mastery     Incorrect

Conventional
  Proportion Correct     25.8          5.6         31.4
  Bayesian Score         37.1          0           37.1
AMT
  Variable Termination   23.6          3.4         27.0
  Fixed Length           24.7          2.3         27.0



examining the total percentage of incorrect classifications made by the discriminant
functions for each testing strategy, the trends noted in the earlier analyses
are seen quite clearly. The lowest total percentage of incorrect classifications
observed was 27.0%, for both of the AMT procedures. The conventional
test strategy with proportion-correct scoring misclassified 1.16 times as many
students as either of the AMT procedures (31.4% of the students), while the conventional
test strategy with Bayesian scoring misclassified one-third more students
than either AMT procedure (37.1% of the students).
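
The relative misclassification figures above follow directly from the total error percentages in Table 7. A brief sketch of the arithmetic, with an illustrative function name:

```python
def relative_misclassification(rate, baseline):
    """Ratio of one strategy's total error percentage to a baseline percentage."""
    return rate / baseline


amt_rate = 27.0  # total incorrect (%) for either AMT procedure, Table 7

print(round(relative_misclassification(31.4, amt_rate), 2))  # 1.16
print(round(relative_misclassification(37.1, amt_rate), 2))  # 1.37
```

The second ratio, about 1.37, corresponds to the "one-third more" students misclassified under Bayesian scoring of the conventional test.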

Discussion and Conclusion 

Two major conclusions result from this study: 

1. In each of the discriminant analyses and in the majority of the individual
   subtest comparisons, the adaptive testing procedure, with either a
   fixed or variable test length, resulted in a consistently higher proportion
   of correct classifications concerning mastery status than did the
   conventional testing procedure with either scoring strategy when classroom
   performance was used as a criterion measure.

2. The variable test length condition used with the adaptive testing procedure
   resulted in test lengths that were, on the average, about 80%
   shorter than the fixed test length, but no consistent differences in
   criterion-related validity were found between the adaptive testing procedures
   that used fixed test length and variable test length.

Although these conclusions appear to contradict previous psychometric
theory (that the single most useful type of test to use when making mastery
classifications for a group of people is a test that concentrates its measurement
precision within the immediate neighborhood of the mastery cutoff level;
Birnbaum, 1968, p. 450), they really serve as an adjunct to previous findings.
Birnbaum's demonstration of the superiority of the peaked test dealt with a single
test administered to a group of students. The AMT strategy implemented in
this study administers different tests to different individuals within a group
of students, depending on the individuals' responses to the test questions.
Thus, the use of the AMT strategy allows for an entire class of mastery tests to
be used to make mastery classifications. One member of this class is the best
peaked test that can be constructed from the item pool for each individual. In
fact, in the analysis of test overlap, it was found that for students whose
achievement level estimates were quite close to the mastery cutoff, the AMT procedure
administered tests that had, on the average, 80% to 90% of the items that
appeared on the conventional peaked test. However, when a student's achievement
level differed from the mastery cutoff level, the AMT procedure tended to administer
tests that had fewer items in common with the best peaked test.

This process of giving tests adapted to different individuals has the effect
of increasing the variance of the observed achievement level estimates,
thus making differences between students (or between a single student and the
mastery criterion) more obvious. In this study, for each of the five content
area subtests the adaptive testing procedure (with either the fixed or variable
test lengths) resulted in greater score variance than did the conventional testing
procedure when Bayesian scoring was used to equate scoring methods. The
mean score variance observed for the conventional test with Bayesian scoring
(across content areas) was .237; for the AMT procedure with variable test
length, the mean score variance was .359; and for the AMT procedure with fixed
test length, the mean score variance was .506. Thus, the AMT procedure, with or
without the variable test length, spread out student achievement level estimates
and allowed a more accurate assessment of student mastery status.

This study has thus demonstrated that the AMT procedure resulted in consistently
more accurate estimation of students' mastery status within a course
of instruction than did the best available conventional test peaked at the mastery
level. Further, it was shown that the use of the AMT's variable termination
capability did not significantly reduce the validity of the mastery level
estimates obtained, while it reduced the mean test length by approximately 80%.
It is interesting that these findings were noted even for proportion-correct
scoring of the conventional test, which had its scoring method in common with
the criterion measure. This common scoring method may explain the observation
that the proportion-correct scoring of the conventional subtests resulted in
slightly higher percentages of correct mastery classifications (using classroom
performance as a criterion) than did Bayesian scoring of the same tests, in most
of the subtest comparisons and in all of the discriminant analyses.

The variable termination AMT procedure has been shown here to be an effi- 
cient way of reducing test length while producing mastery classifications of 
comparable or higher quality than those made by conventional mastery tests constructed
to maximize accuracy of mastery classifications. Given the proliferation
of microcomputers in instructional settings, the AMT procedure should find
application in many large-scale instructional settings in which conventional
mastery testing is currently being used.






References 



Bejar, I. I., & Weiss, D. J. Computer programs for scoring test data with item
characteristic curve models (Research Report 79-1). Minneapolis: University
of Minnesota, Department of Psychology, Psychometric Methods Program,
February 1979.

Bejar, I. I., Weiss, D. J., & Kingsbury, G. G. Calibration of an item pool for
the adaptive measurement of achievement (Research Report 77-5). Minneapolis:
University of Minnesota, Department of Psychology, Psychometric Methods
Program, September 1977.

Birnbaum, A. Some latent trait models and their use in inferring an examinee's
ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test
scores. Reading, MA: Addison-Wesley, 1968.

Kingsbury, G. G., & Weiss, D. J. An adaptive testing strategy for mastery decisions
(Research Report 79-5). Minneapolis: University of Minnesota, Department
of Psychology, Psychometric Methods Program, September 1979.

Kingsbury, G. G., & Weiss, D. J. A comparison of ICC-based adaptive mastery
testing and the Waldian probability ratio method. In D. J. Weiss (Ed.),
Proceedings of the 1979 Computerized Adaptive Testing Conference. Minneapolis:
University of Minnesota, Department of Psychology, Psychometric Methods
Program, Computerized Adaptive Testing Laboratory, 1980.

Lord, F. M. Application of item response theory to practical testing problems.
Hillsdale, NJ: Erlbaum, 1980.

Lord, F. M., & Novick, M. R. Statistical theories of mental test scores. Reading,
MA: Addison-Wesley, 1968.

Owen, R. J. A Bayesian approach to tailored testing (Research Bulletin 69-92).
Princeton, NJ: Educational Testing Service, December 1969.

Tatsuoka, M. M. Multivariate analysis: Techniques for educational and psychological
research. New York: Wiley, 1971.

Urry, V. W. Ancillary estimators for the item parameters of mental test models.
In W. A. Gorham (Chair), Computers and testing: Steps toward the inevitable
conquest (PS-76-1). Washington, DC: U.S. Civil Service Commission, Personnel
Research and Development Center, September 1976. (NTIS No. PB-261-694)

Wald, A. Sequential analysis. New York: Wiley, 1947.






Appendix: Supplementary Tables 



Table A

Item Numbers and Estimates of Item Discrimination (a), Item
Difficulty (b), and Lower Asymptote (c) for Each Item Used
in the Adaptive Testing Pool for the Chemistry Content Area,
and Items Comprising the Conventional Test

Item                              Item
Number      a       b      c      Number      a       b      c
3000      1.76     .87    .37     3052       .95     .18    .49
3003      1.47   -1.66    .32     3053      1.08    1.32    .49
3005*     1.49    -.26    .26     3054      1.78    -.71    .34
3008      1.36   -1.45    .30     3055      2.36    -.60    .23
3009      2.21    -.82    .16     3056      1.30    1.12    .43
3010      1.05     .44    .51     3057      1.50   -1.10    .28
3011      1.60    -.68    .27     3058       .92    -.93    .30
3012      1.26     .66    .37     3060      1.20   -1.16    .26
3013      1.39   -1.12    .29     3061       .99    1.69    .35
3014      1.35    -.85    .24     3062*     2.22     .49    .35
3016      1.39   -1.41    .49     3064      1.12     .93    .30
3018       .80    1.02    .42     3065      1.66   -1.57    .36
3019*     1.55     .33    .32     3066      1.31     .63    .38
3020      1.61   -1.09    .27     3067*     1.46    -.29    .32
3022       .77    -.66    .15     3069      1.00    -.18    .44
3025      1.03   -1.67    .43     3070      1.07   -1.19    .23
3028      1.72   -1.02    .26     3072*     1.56     .64    .38
3031*     1.54    -.56    .30     3073      1.54   -1.26    .36
3032      1.29   -1.04    .35     3075       .97    -.55    .49
3033      2.38    2.66    .63     3078      1.85   -1.50    .29
3034*     1.34     .40    .38     3082      1.02    1.93    .51
3036*     1.42    -.57    .37     3083*     1.43    -.60    .30
3038      1.98    -.78    .28     3084*     1.73    -.55    .43
3041*     1.94     .36    .42     3085      1.70   -1.70    .42
3042*     1.52     0      .27     3086      1.04    -.46    .44
3044      1.13   -1.19    .23     3087       .93   -1.27    .23
3045      3.00    2.70    .65     3088      1.05    -.21    .48
3046*     1.42     .24    .30     3089*     1.67     .63    .39
3047*     2.11     .39    .31     3090      2.05   -1.56    .32
3048      1.76     .77    .40     3092*     1.50    -.05    .43
3049*     1.30    -.36    .38     3095      1.46   -1.03    .20
3050*     2.14     .64    .40     3096      2.34   -1.35    .27
3051*     2.21     .27    .35     3097*     1.86     .77    .33

*Item was administered on the conventional test.






Table B

Item Numbers and Estimates of Item Discrimination (a), Item
Difficulty (b), and Lower Asymptote (c) for Each Item Used
in the Adaptive Testing Pool for the Cell Content Area,
and Items Comprising the Conventional Test

Item                              Item
Number      a       b      c      Number      a       b      c
3201      1.46   -1.16    .31     3245      2.08    -.87    .34
3202      2.15    -.66    .44     3247      2.33    2.37    .74
3205      1.72   -1.29    .28     3248       .81    -.40    .33
3206      1.34    1.76    .47     3249      1.48   -1.06    .32
3208*      .80    -.19    .21     3250       .99    2.36    .41
3209      2.63    2.93    .72     3252      1.34   -1.55    .47
3210      1.74   -1.31    .28     3259       .95     .28    .42
3211*     1.48     .63    .43     3260      1.24    1.33    .51
3212      1.07    -.85    .45     3261      1.43     .59    .67
3214*     1.86     .24    .37     3262       .98     .55    .52
3216*     1.82    -.29    .32     3263      1.15    2.34    .60
3217*     1.45    -.20    .30     3264       .83     .61    .40
3218      1.29     .76    .36     3265      1.70   -1.38    .68
3219      2.06     .94    .44     3266      1.67   -1.00    .53
3221*     1.99    -.66    .27     3267      1.82    -.70    .40
3222       .99    -.58    .38     3268*     1.80     .38    .48
3223      1.44   -1.54    .59     3270       .94    -.21    .43
3224*     1.14    -.07    .42     3271      1.58    1.84    .57
3226*     1.50    -.22    .37     3272*     2.26    -.48    .54
3227       .97   -1.04    .37     3274*     1.63    -.21    .55
3228      1.27    2.78    .54     3276      1.11     .29    .51
3229       .80     .60    .38     3282*     2.15    -.13    .35
3230      1.47    2.46    .62     3284*     1.17     .12    .51
3232      1.62     .99    .71     3285*     1.37    -.22    .31
3234      2.53    3.01    .59     3286       .91   -1.19    .51
3235      1.97   -1.24    .26     3287      1.63    -.93    .38
3236*     1.35     .03    .46     3289      2.36   -1.21    .66
3237      2.75    -.76    .22     3290*     2.42    -.29    .37
3238*     1.39    -.76    .30     3291       .80     .21    .34
3240*     1.54     .39    .46     3292      2.16    1.60    .69
3241      1.84    1.68    .49     3293      1.58   -1.02    .41
3243       .84    -.56    .40     3294*     1.19    -.69    .30
3244*     1.79    -.28    .32

*Item was administered on the conventional test.






Table C 

Item Numbers and Estimates of Item Discrimination (a), Item
Difficulty (b), and Lower Asymptote (c) for Each Item Used
in the Adaptive Testing Pool for the Energy Content Area, 
and Items Comprising the Conventional Test 



Item                            Item
Number        a      b      c   Number        a      b      c

3401        .94   1.10    .43   3453*      2.?4    .68    .44
3402       1.68   2.17    .55   3454       1.39   2.39    .51
3403*      2.77    .06    .29   3455       1.86   -.73    .37
3404       1.9?   -.83    .59   3456       1.30   2.73    .48
3405*      1.30    .52    .35   3457       1.23   1.65    .34
3406       1.42   2.42    .48   3458       2.27  -1.09    .42
3407       1.39   2.22    .49   3459       1.33   -.49    .32
3408*      2.55    .74    .25   3460       2.56   1.39    .34
3409       2.46   2.91    .71   3461       1.18   1.06    .49
3410       2.01   1.41    .4?   3462       2.09   -.69    .50
3412       1.46   -.82    .44   3463       2.93  -1.58    .50
3413*      2.22    .60    .52   3464       2.82   -.08    .32
3414       1.66   2.10    .50   3465       1.93   1.18    .62
3415       4.1?  -2.27    .12   3466       2.4?   -.12    .47
3416       1.49   1.24    .52   3467       1.77   -.44    .48
3418*      1.46    .68    .49   3468       1.43    .96    .58
3419       2.20   1.49    .42   3469       2.38   -.95    .62
3420       1.10   1.62    .47   3470       2.45   -.68    .38
3421       1.22    .07    .32   3471       1.35   -.17    .48
3423*      1.38    .79    .50   3472       2.19  -1.36    .64
3424       1.68   -.19    .59   3473       1.09    .02    .46
3425       4.03   -.4?    .00   3474       2.?0   2.01    .62
3426       2.28   -.05    .49   3475       1.30    .07    .50
3427       1.36   1.84    .39   3476*      1.96    .12    .49
3428       2.64  -1.44    .64   3477       2.18   -.80    .60
3429*      ?.??    .92    .33   3478       1.63   2.01    .63
3431       1.?4    .0?    .?9   3479       1.62    .05    .55
3432       2.?8   -.46    .4?   3480       1.1?  -1.23    .58
3433       1.28   1.28    .37   3482       1.36    .12    .55
3434        .67    .62    .37   3483       3.89   1.14    .43
3435       2.07   -.49    .68   3484*      1.86    .21    .60
3436*      1.74   1.10    .38   3485       3.26  -1.19    .26
3437*      2.73    .45    .22   3486       1.79   1.42    .55
3438*      1.34    .16    .36   3487       2.29   1.64    .63
3439*      2.34    .28    .34   3488       1.93   -.08    .54
3440       1.78   1.93    .38   3489       2.65   -.99    .40
3441       1.31    .19    .61   3490       1.40  -1.39    .55
3443       2.89  -1.33    .78   3491        .93    .24    .51
3444*      1.47    .60    .40   3492*      2.62    .28    .50
3445*      2.04    .36    .40   3493       3.34  -1.85    .22
3447*      1.26    .97    .38   3494       3.27  -1.57    .19
3448*      1.69    .64    .32   3495*      2.07    .81    .53
3449       2.73   2.29    .48   3496       2.54  -1.26    .34
3452        .86   2.24    .39











*Item was administered on the conventional test.
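The a, b, and c estimates listed in these item-pool tables are the parameters of an item characteristic curve. As a minimal sketch, assuming the three-parameter logistic form with the customary scaling constant D = 1.7 (this appendix states neither), the probability of a correct response at a given achievement level could be computed as follows. The function name `p_correct` is hypothetical.

```python
import math

D = 1.7  # assumed scaling constant; not given in this appendix


def p_correct(theta, a, b, c):
    """Three-parameter logistic item characteristic curve (assumed model).

    theta: achievement level, a: discrimination, b: difficulty,
    c: lower asymptote.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Item 3405 from the Energy pool: a = 1.30, b = .52, c = .35
for theta in (-1.0, 0.52, 2.0):
    p = p_correct(theta, 1.30, 0.52, 0.35)
    print(f"theta = {theta:+.2f}   P(correct) = {p:.3f}")
```

Under this model the curve at theta = b lies halfway between c and 1, so for item 3405 it equals (1 + .35) / 2 = .675.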






Table D 

Item Numbers and Estimates of Item Discrimination (a), Item 
Difficulty (b), and Lower Asymptote (c) for Each Item Used 
in the Adaptive Testing Pool for the Genetics Content Area, 
and Items Comprising the Conventional Test 



Item                            Item
Number        a      b      c   Number        a      b      c

3601       1.08   1.30    .41   3666        .70   1.21    .27
3602       1.15  -1.29    .54   3668       1.16   -.74    .17
3603*      1.29    .41    .28   3669*      1.89    .22    .18
3606        .77   -.27    .13   3671       1.49   -.23    .22
3609        .89    .18    .43   3673       1.44   1.36    .33
3610        .98  -1.10    .17   3674*      1.66    .66    .28
3611*      1.34    .26    .29   3675*      1.30    .48    .33
3614        .66    .36    .35   3679       1.42   -.89    .27
3615*      1.74   1.12    .30   3680       1.59   -.85    .21
3616        .99   1.06    .41   3683        .94  -1.22    .18
3617        .99   -.99    .23   3684        .90   -.69    .21
3618        .98    .13    .41   3685       1.25   -.98    .18
3620       1.92   2.83    .66   3692       1.38   -.98    .33
3621        .98   -.66    .16   3693       1.46   -.18    .33
3622       1.14   2.60    .51   3695       1.23  -1.31    .32
3623*      1.54    .74    .32   3696        .83   -.51    .14
3625       1.15   2.11    .50   3698       2.27   2.45    .60
3627*      1.21    .32    .37   3699        .65    .52    .36
3628*      1.17    .46    .27   3700       1.10   1.03    .35
3630        .68   -.52    .38   3701        .95   -.74    .27
3631       1.73   -.86    .28   3703       1.08   -.70    .27
3632*      1.39    .16    .36   3704       1.59  -1.06    .30
3633        .99   -.98    .30   3707*      1.89    .48    .29
363?        .66    .72    .38   3708       1.57   -.20    .16
3636       1.17   -.49    .17   3709       1.29    .25    .36
3637       1.22   -.62    .18   3710       1.16   -.63    .20
3638       1.70  -1.42    .34   3711       1.31   -.82    .30
3640       1.42   -.67    .40   3712        .84   1.89    .37
3641       1.21   -.61    .23   3713        .74   -.91    .42
3642       1.06   1.17    .26   3715       1.37  -1.50    .34
3646       1.28    .89    .37   3716       ?.29   1.27    .35
3648       1.89  -1.08    .32   3717        .90   1.25    .41
3649       1.14   -.03    .21   3718       1.03    .12    .31
3651       1.14   2.18    .53   3719*      1.10    .49    .24
3654*      1.83    .94    .26   3720*      1.48    .18    .26
3656        .67   -.40    .32   3721       1.53  -1.05    .29
3657        .87  -1.67    .38   3728       1.09   2.87    .52
3658*      1.31    .36    .40   3733       1.37   1.26    .39
3661*      1.68    .29    .25   3735       1.42  -1.03    .22
3662*      1.10    .64    .17   3745*      2.01   -.10    .17
3663        .72   -.10    .36   3746*      1.88    .32    .25
3665*      1.43    .87    .33   3751        .85   2.02    .41



*Item was administered on the conventional test. 






Table E

Item Numbers and Estimates of Item Discrimination (a), Item
Difficulty (b), and Lower Asymptote (c) for Each Item Used
in the Adaptive Testing Pool for the Reproduction/Embryology
Content Area, and Items Comprising the Conventional Tests



Item                            Item
Number        a      b      c   Number        a      b      c

3804       1.90   1.71    .50   3902        .?2   1.74    .40
3806*      2.28    .30    .34   3903       1.30   ?.76    .30
3807       3.01  -1.04    .18   3904       2.64   2.68    .54
3812*      1.18   -.05    .36   3905*      2.07    .69    .43
3813       1.69   -.76    .40   3906*      1.08   -.5?    .21
3814       1.64   -.47    .44   3907       2.40  -1.06    .68
3815*      1.47    .56    .44   3908*      1.69    .14    .39
3817*      1.12   -.07    .47   3909*      1.58   1.04    .48
3819*      1.47    .54    .49   3910       2.47  -1.47    .43
3820*      1.30    .52    .26   3912*      1.41   1.02    .41
3825       1.98  -1.17    .36   3913       2.41  -1.05    .25
3830       4.13   1.52    .11   3914*      1.79   -.07    .??
3832       1.75  -1.51    .38   3915       2.53   -.33    .24
3833       3.10   2.29    .40   3918*      1.41    .63    .44
3834       1.74  -1.28    .77   3919       2.41   -.49    .49
3835       1.40   2.03    .57   3920       2.05  -1.01    .53
3837       1.60   -.79    .59   3921       1.85   1.52    .53
3838       2.28  -1.36    .61   3922*      1.52    .38    .53
3841       1.20   2.23    .50   3923*      1.41    .61    .52
3847       1.36   -.27    .55   3924*      1.88   -.18    .54
3850       1.79   1.41    .58   3925*      1.68    .74    .46
3851*      1.02    .19    .33   3926       1.67  -1.08    .36
3852        .99  -1.59    .49   3927       1.71  -1.51    .40
3853*      1.30    .34    .37   3928       1.45   -.96    .34
3854*      1.36   -.47    .32   3929       3.43   1.36    .10
3901       2.34   2.59    .?2











*Item was administered on the conventional test.



Table F 

Percentage of Students for Whom the
Adaptive Testing Procedure Terminated
at Each Test Length Within Each Content Area



Number
of Items                 Content Area
Administered      1      2      3      4      5

1               0.0   24.2    0.0    0.0   62.5
2              32.6    0.0   31.4   37.5    0.0
3              19.9   23.7   25.0   12.1   21.4
4              13.6   14.0   26.3   23.7    5.4
5                .8    7.6    9.7   12.9    4.5
6               6.8     .8    3.0    4.0    1.8
7               6.4    5.5     .8    3.1    2.2
8               2.5    3.4     .4    2.7     .4
9               1.3    3.4    1.7    1.3    0.0
10              3.0    3.8     .4     .9    0.0
11              4.2    1.3     .4     .4    0.0
12              1.3    4.7    0.0    0.0     .4
13               .8    2.1     .4     .4    0.0
14              2.1    0.0    0.0     .4    0.0
15              1.3    1.3    0.0    0.0    0.0
16               .4     .4    0.0    0.0     .4
17               .4    1.3    0.0    0.0    0.0
18               .8     .8    0.0    0.0    0.0
19              0.0    0.0    0.0    0.0     .4
20              1.7    1.7     .4     .4     .4
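A termination distribution of this kind can be condensed into a mean test length by weighting each test length by the percentage of students who stopped there. The sketch below does this for the Content Area 1 column; the function name `mean_test_length` is hypothetical, and the summary statistic is an illustration rather than a figure reported in this appendix.

```python
def mean_test_length(percentages):
    """Mean number of items administered.

    percentages[k] is the percentage of students whose test stopped after
    k + 1 items. Dividing by the column total absorbs rounding error in
    percentages that do not sum to exactly 100.
    """
    total = sum(percentages)
    if total <= 0:
        raise ValueError("percentages must sum to a positive value")
    weighted = sum(n * p for n, p in enumerate(percentages, start=1))
    return weighted / total


# Content Area 1 column, test lengths 1 through 20
area_1 = [0.0, 32.6, 19.9, 13.6, 0.8, 6.8, 6.4, 2.5, 1.3, 3.0,
          4.2, 1.3, 0.8, 2.1, 1.3, 0.4, 0.4, 0.8, 0.0, 1.7]

print(f"mean test length, Content Area 1: {mean_test_length(area_1):.2f} items")
```

The Content Area 1 percentages sum to 99.9 rather than 100 because of rounding, which is why the sketch divides by the column total instead of by 100.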






Table G 

Mean and Standard Deviation (SD) of Information Obtained 
From Variable-Length Adaptive Tests in Each Content
Area as a Function of Estimated Achievement Level



Estimated Content Area 

Achievement 1 2 3 4 5 

Level N Mean SD N Mean SD N Mean SD N Mean SD N Mean SD 



— 1 Q 


















o 






o 






 -1.7     0                  0                  0                  0                  0
 -1.5     0                  0                  0                  0                  0
 -1.3     0                  0                  0                  0                  0
 -1.1     0                 42    .79   .00     0                  0                  0
  -.9    77    .84   .00     8   4.25   .00     0                 52   1.30   .00     0
  -.7    20   2.68  1.43    61   1.20  1.45    31   8.28   .00    23   2.61   .00   114    .86   .36
  -.5    50   4.30  1.65    52   4.55  3.15   105   4.02  5.48     3   4.90   .78    57   3.32  1.34
  -.3    32   6.07  1.53    23   9.19  4.75    19   7.49  ?.99    74   4.?4  2.67    19   7.27  1.58
  -.1     0                  6  14.03   .90     9  11.69  1.05     7  12.82  1.70     0
   .1     2  12.46   .40     3  13.28  1.13     3  17.64  3.01     3  15.45   .35     2  11.86   .43
   .3    10   9.86  1.12     5   9.25  1.99    11  10.66  2.36     7  11.22  2.36     3   7.76  1.30
   .5     6   9.76   .15    10   5.87   .60     0                  2   7.54   .00     1   5.09   .00
   .7    23   7.54   .41     7   4.56   .14    40   6.72   .00    10   4.71   .15     0
   .9     0                  4   4.51   .00     0                  0                  6   3.20   .00
  1.1     0                  0                  0                  0                  0






1.3 

1.5 


17 


1.65 


.00 


16 


2.04 


.00 


0 




0 






0 






0 






0 






19 








— o- 






  1.7     0                  0                  0                  0                 24   4.38   .00
  1.9     0                  0                  0                  0                  0







Table H 

Mean and Standard Deviation (SD) of Information Obtained
From 20-Item Adaptive Tests in Each Content Area
as a Function of Estimated Achievement Level



Estimated                                           Content Area
Achievement         1                  2                  3                  4                  5
Level           N   Mean    SD     N   Mean    SD     N   Mean    SD     N   Mean    SD     N   Mean    SD

 -1.9           2   5.11   .70     2   2.13  1.21     3  10.82   .81     1   3.14   .00     2   2.71  ?.05
 -1.7          14   9.32  1.06     8   4.14   .60     9  13.41   .90     4   4.85   .70    11   4.34   .77
 -1.5          31  13.13   .83    12   7.00  4.02    12  16.31   .67    11   7.19   .93    22   8.33  1.18
 -1.3          21  15.79   .70    18  10.31   .94    26  18.00   .49    16  10.30   .82    45  12.64  1.35
 -1.1          17  16.03  1.86    29  13.74  1.09    29  18.52   .77    22  12.28   .78    47  16.66  1.02
  -.9          28  17.54  1.60    23  16.23  1.05    29  18.69   .94    22  13.35  1.08    23  16.72  1.28
  -.7          28  17.85  1.54    35  17.91  1.20    36  21.47  1.26    25  13.95  1.09    26  14.9?   .79
  -.5          22  17.23  1.42    36  17.30  1.39    26  23.57  1.21    26  13.12  1.02     9  13.27   .97
  -.3          11  14.12  1.53    22  16.83  1.05    18  20.91  1.85    27  13.36   .76    16  13.05   .87
  -.1          14  12.93  1.30    19  14.75  1.77     4  19.09   .44    18  14.29   .58     6  12.17   .88
   .1           6  12.11   .95    12  12.74  1.26    10  19.27  1.52    13  15.15   .87     5  10.98  1.43
   .3           8  13.89   .86     9  10.88   .96    13  20.65   .52    15  15.67   .66     2  10.62   .44
   .5           4  16.64   .42     3   8.80  1.05     8  21.09   .37     7  15.68   .60     5  10.68   .77
   .7           8  16.62   .37     3   8.14   .44     4  19.73   .99     8  14.93   .61     2  10.16   .87
   .9           7  14.49   .96     1   7.50   .00     3  18.?0  1.15     4  14.65   .52     2   9.61  1.16
  1.1           7  11.58   .53     3   6.47   .14     3  19.01   .42     1  12.95   .00     2  11.74   .84
  1.3           3   8.66  1.36     1   5.97   .00     0                  2  10.91   .56     0
  1.5           4   5.78   .41     1   5.39   .00     0                  2   9.65   .51     0
  1.7           0                  0                  2  11.55   .99     1   7.57   .00     1  19.93   .00
  1.9           2   3.30   .22     0                  2   7.96  3.58     1   5.35   .00     0










Table I 

Theoretical Test Information for Conventional
Tests in Each Content Area as a
Function of Achievement Level



Achievement              Content Area
Level          1       2       3       4       5

 -1.9        .07     .15     .00     .01     .08
 -1.7        .17     .30     .00     .02     .14
 -1.5        .39     .60     .01     .04     .26
 -1.3        .85    1.20     .02     .10     .47
 -1.1       1.71    2.32     .04     .24     .80
  -.9       3.04    4.24     .11     .59    1.33
  -.7       4.75    7.12     .29    1.37    2.14
  -.5       6.50   10.63     .82    2.92    3.34
  -.3       8.10   13.55    2.39    5.47    4.97
  -.1       9.79   14.46    5.95    8.80    6.91
   .1      12.00   13.39   11.28   12.24    8.91
   .3      14.40   11.40   16.47   14.89   10.54
   .5      15.54    9.19   19.65   16.06   11.28
   .7      14.38    7.02   20.32   15.66   10.99
   .9      11.42    5.07   18.64   14.08    9.79
  1.1       8.00    3.49   14.50   11.74    7.99
  1.3       5.13    2.32    9.63    9.06    6.04
  1.5       4.01    1.50    5.89    6.49    4.30
  1.7       1.82     .96    3.??    ?.3?    ?.??
  1.9       1.06     .62    2.07    2.84    1.92



Table J 

Development Group Discriminant Function Weights and Constants
Used to Estimate Classroom Mastery Status from Mastery Status
Estimated from Each Content Area Test during Each Testing
Session and Across Testing Sessions for Each Testing Procedure



Testing Session                          Content Area
and Procedure                  1       2       3       4       5   Constant

Testing Session 1 (N=100)
  Conventional
    Proportion Correct      1.86    1.75    1.45                       -.54
    Bayesian                2.80    1.48    1.30                       -.40
  AMT
    Variable Termination    1.89    1.88     .21                       -.61
    Fixed Length            2.26    1.46     .26                       -.55

Testing Session 2 (N=100)
  Conventional
    Proportion Correct                              5.92    5.92       -.18
    Bayesian                                        7.17     .00       -.14
  AMT
    Variable Termination                            2.50     .40       -.60
    Fixed Length                                    2.55    1.56       -.53

Both Sessions (N=89)
  Conventional
    Proportion Correct      1.95    1.41    1.49    -.23    -.08       -.69
    Bayesian                2.68    -.06   -1.96    2.22   -1.96       -.49
  Adaptive
    Variable Length         1.20     .70    -.87    2.05    -.83        .76
    Fixed Length            1.63     .75   -1.41    2.11     .30       -.72
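Each row of weights and its constant defines a linear discriminant function: the constant plus the weighted sum of the five content-area mastery statuses. The sketch below evaluates the Both Sessions, Fixed Length row. The 1 = mastery, 0 = nonmastery coding of the predictors is an assumption, since this table does not give the coding or the cutoff used to turn the score into a classroom mastery classification, and the function name `discriminant_score` is hypothetical.

```python
def discriminant_score(weights, constant, statuses):
    """Linear discriminant score: constant + sum of weight * status."""
    if len(weights) != len(statuses):
        raise ValueError("need one status per weight")
    return constant + sum(w * x for w, x in zip(weights, statuses))


# Both Sessions, Fixed Length adaptive procedure, Content Areas 1-5
weights = [1.63, 0.75, -1.41, 2.11, 0.30]
constant = -0.72

# Assumed coding: 1 = mastery, 0 = nonmastery on each content-area test
all_mastery = [1, 1, 1, 1, 1]
no_mastery = [0, 0, 0, 0, 0]

print(f"{discriminant_score(weights, constant, all_mastery):.2f}")
print(f"{discriminant_score(weights, constant, no_mastery):.2f}")
```

Under the assumed coding, a student classified as a nonmaster in every content area receives the constant itself as a score.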








Previous Publications 



Proceedings of the 1977 Computerized Adaptive Testing Conference. 
July 1978. 

Research Reports 

Final Report: Computerized Adaptive Ability Testing. April 1981.

81-2. Effects of Immediate Feedback and Pacing of Item Presentation on Ability
Test Performance and Psychological Reactions to Testing. February 1981.

81-1. Review of Test Theory and Methods. January 1981.

80-5. An Alternate-Forms Reliability and Concurrent Validity Comparison of
Bayesian Adaptive and Conventional Ability Tests. December 1980.

80-4. A Comparison of Adaptive, Sequential, and Conventional Testing Strategies
for Mastery Decisions. November 1980.

80-3. Criterion-Related Validity of Adaptive Testing Strategies. June 1980.

80-2. Interactive Computer Administration of a Spatial Reasoning Test. April 1980.

Final Report: Computerized Adaptive Performance Evaluation. February 1980.

80-1. Effects of Immediate Knowledge of Results on Achievement Test Performance
and Test Dimensionality. January 1980.

79-7. The Person Response Curve: Fit of Individuals to Item Characteristic
Curve Models. December 1979.

79-6. Efficiency of an Adaptive Inter-Subtest Branching Strategy in the
Measurement of Classroom Achievement. November 1979.

79-5. An Adaptive Testing Strategy for Mastery Decisions. September 1979.

79-4. Effect of Point-in-Time in Instruction on the Measurement of Achievement.
August 1979.

79-3. Relationships among Achievement Level Estimates from Three Item
Characteristic Curve Scoring Methods. April 1979.

Final Report: Bias-Free Computerized Testing. March 1979.

79-2. Effects of Computerized Adaptive Testing on Black and White Students.
March 1979.

79-1. Computer Programs for Scoring Test Data with Item Characteristic Curve
Models. February 1979.

78-5. An Item Bias Investigation of a Standardized Aptitude Test. December 1978.

78-4. A Construct Validation of Adaptive Achievement Testing. November 1978.

78-3. A Comparison of Levels and Dimensions of Performance in Black and White
Groups on Tests of Vocabulary, Mathematics, and Spatial Ability.
October 1978.

78-2. The Effects of Knowledge of Results and Test Difficulty on Ability Test
Performance and Psychological Reactions to Testing. September 1978.

78-1. A Comparison of the Fairness of Adaptive and Conventional Testing
Strategies. August 1978.

77-7. An Information Comparison of Conventional and Adaptive Tests in the
Measurement of Classroom Achievement. October 1977.

77-6. An Adaptive Testing Strategy for Achievement Test Batteries. October 1977.

77-5. Calibration of an Item Pool for the Adaptive Measurement of Achievement.
September 1977.




77-4. A Rapid Item-Search Procedure for Bayesian Adaptive Testing. May 1977.

77-3. Accuracy of Perceived Test-Item Difficulties. May 1977.

77-2. A Comparison of Information Functions of Multiple-Choice and Free-
Response Vocabulary Items. April 1977.

77-1. Applications of Computerized Adaptive Testing. March 1977. 

Final Report: Computerized Ability Testing, 1972-1975. April 1976. 

76-5. Effects of Item Characteristics on Test Fairness. December 1976. 

76-4. Psychological Effects of Immediate Knowledge of Results and Adaptive 
Ability Testing. June 1976. 

76-3. Effects of Immediate Knowledge of Results and Adaptive Testing on Ability 
Test Performance. June 1976. 

76-2. Effects of Time Limits on Test-Taking Behavior. April 1976. 

76-1. Some Properties of a Bayesian Adaptive Ability Testing Strategy. March 
1976. 

75-6. A Simulation Study of Stradaptive Ability Testing. December 1975.

75-5. Computerized Adaptive Trait Measurement: Problems and Prospects.
November 1975.

75-4. A Study of Computer-Administered Stradaptive Ability Testing. October 
1975. 

75-3. Empirical and Simulation Studies of Flexilevel Ability Testing. July 
1975. 

75-2. TETREST: A FORTRAN IV Program for Calculating Tetrachoric Correlations.
March 1975.

75-1. An Empirical Comparison of Two-Stage and Pyramidal Adaptive Ability
Testing. February 1975.

74-5. Strategies of Adaptive Ability Measurement. December 1974.

74-4. Simulation Studies of Two-Stage Ability Testing. October 1974.

74-3. An Empirical Investigation of Computer-Administered Pyramidal Ability
Testing. July 1974.

74-2. A Word Knowledge Item Pool for Adaptive Ability Measurement. June 1974.

74-1. A Computer Software System for Adaptive Ability Measurement. January 1974.

73-4. An Empirical Study of Computer-Administered Two-Stage Ability Testing.
October 1973.

73-3. The Stratified Adaptive Computerized Ability Test. September 1973.

73-2. Comparison of Four Empirical Item Scoring Procedures. August 1973.

73-1. Ability Measurement: Conventional or Adaptive? February 1973.



Copies of these reports are available, while supplies last, from: 
Computerized Adaptive Testing Laboratory 
N660 Elliott Hall 
University of Minnesota 
75 East River Road 
Minneapolis MN 55455 U.S.A. 




